Play Selection in American Football: a Case Study in Neuro-dynamic Programming

Authors

  • Stephen D. Patek
  • Dimitri P. Bertsekas
Abstract

We present a computational case study of neuro-dynamic programming, a recent class of reinforcement learning methods. We cast the problem of play selection in American football as a stochastic shortest path Markov Decision Problem (MDP). In particular, we consider the problem faced by a quarterback in attempting to maximize the net score of an offensive drive. The resulting optimization problem serves as a medium-scale testbed for numerical algorithms based on policy iteration. The algorithms we consider evolve as a sequence of approximate policy evaluations and policy updates. An (exact) evaluation amounts to the computation of the reward-to-go function associated with the policy in question. Approximations of reward-to-go are obtained either as the solution or as a step toward the solution of a training problem involving simulated state/reward data pairs. Within this methodological framework there is a great deal of flexibility. In specifying a particular algorithm, one must select a parametric form for esti-
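
The loop described in the abstract (approximate policy evaluation from simulated state/reward pairs, followed by a policy update) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy state `(zone, down)`, the two plays, the probabilities and rewards, and the function names are all hypothetical. A lookup table of averaged simulated returns stands in for the parametric approximation.

```python
import random

# Hypothetical toy drive: state = (zones_to_goal, down). All numbers are invented.
PLAYS = ("run", "pass")
ZONES, MAX_DOWN = 5, 4
STATES = [(z, d) for z in range(1, ZONES + 1) for d in range(1, MAX_DOWN + 1)]


def outcomes(state, play):
    """Return [(prob, next_state_or_None, reward)] for one play; None ends the drive."""
    zone, down = state
    p_gain, gain = (0.6, 1) if play == "run" else (0.35, 2)
    # Success: advance and reset the down; reaching the goal line scores 7.
    new_zone = zone - gain
    success = (None, 7.0) if new_zone <= 0 else ((new_zone, 1), 0.0)
    # Failure: lose a down; running out of downs ends the drive with a penalty.
    failure = (None, -1.0) if down == MAX_DOWN else ((zone, down + 1), 0.0)
    return [(p_gain, *success), (1.0 - p_gain, *failure)]


def simulate(state, policy, rng):
    """Sample one drive under `policy`; return the total reward collected from `state`."""
    total = 0.0
    while state is not None:
        outs = outcomes(state, policy[state])
        _, state, reward = rng.choices(outs, weights=[o[0] for o in outs])[0]
        total += reward
    return total


def evaluate(policy, rng, n_rollouts=200):
    """Approximate reward-to-go: average of simulated returns for each state."""
    return {
        s: sum(simulate(s, policy, rng) for _ in range(n_rollouts)) / n_rollouts
        for s in STATES
    }


def improve(J):
    """Policy update: greedy one-step lookahead against the approximation J."""
    def q(s, a):
        return sum(p * (r + (J[s2] if s2 is not None else 0.0))
                   for p, s2, r in outcomes(s, a))
    return {s: max(PLAYS, key=lambda a: q(s, a)) for s in STATES}


def approximate_policy_iteration(n_iters=5, seed=0):
    """Alternate evaluation and update; J is the estimate for the policy before the last update."""
    rng = random.Random(seed)
    policy = {s: "run" for s in STATES}
    J = {}
    for _ in range(n_iters):
        J = evaluate(policy, rng)
        policy = improve(J)
    return policy, J
```

Calling `approximate_policy_iteration()` returns a play for every toy state together with the last reward-to-go estimates. In a larger problem the lookup table would typically be replaced by a fitted parametric approximation, which is where the design choices mentioned in the abstract arise.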

Similar resources

Neuro-Fuzzy Based Algorithm for Online Dynamic Voltage Stability Status Prediction Using Wide-Area Phasor Measurements

In this paper, a novel neuro-fuzzy based method combined with a feature selection technique is proposed for online dynamic voltage stability status prediction of power system. This technique uses synchronized phasors measured by phasor measurement units (PMUs) in a wide-area measurement system. In order to minimize the number of neuro-fuzzy inputs, training time and complication of neuro-fuzzy ...

The effects of fifa 11+ for referees comprehensive of warm-up program on dynamic balance in Iranian football referees

The present study aimed to examine the effects of FIFA 11+ comprehensive warm-up program for referees on dynamic balance among male Iranian football referees and assistant referees. Fifty-two football referees and assistant referees who had no previous injury voluntarily participated in the present study. They were randomly assigned into an intervention group and a control one (26 participants ...

Do Firms Maximize? Evidence from Professional Football

This paper examines a single, narrow decision—the choice on fourth down in the National Football League between kicking and trying for a first down—as a case study of the standard view that competition in the goods, capital, and labor markets leads firms to make maximizing choices. Play-by-play data and dynamic programming are used to estimate the average payoffs to kicking and trying for a fir...

A dynamic programming approach for solving nonlinear knapsack problems

Nonlinear Knapsack Problems (NKP) are the alternative formulation for the multiple-choice knapsack problems. A powerful approach for solving NKP is dynamic programming, which may obtain the global optimal solution even in the case of discrete solution space for these problems. Despite the power of this solution approach, it computationally performs very slowly when the solution space of the pr...

Feature Selection for Neuro-Dynamic Programming

Neuro-Dynamic Programming encompasses techniques from both reinforcement learning and approximate dynamic programming. Feature selection refers to the choice of basis that defines the function class that is required in the application of these techniques. This chapter reviews two popular approaches to neuro-dynamic programming, TD-learning and Q-learning. The main goal of the chapter is to demon...

Publication date: 2001